Wideband speech and audio coding using gammatone filter banks
Abstract
Considerable research attention has been directed towards speech and audio coding algorithms capable of producing high quality coded speech and audio; however, few of these use signal representations that account for temporal as well as spectral detail. This paper presents a new technique for 16 kHz wideband speech and audio coding, in which analysis and synthesis are performed using a linear phase gammatone filter bank. The outputs of these critical band filters are processed to obtain a series of pulse trains that represent neural firing. Auditory masking is then applied to reduce the number of pulses, producing a more compact time-frequency parameterization. The critical band gains and the pulse amplitudes and positions are then coded using a combination of non-uniform quantization, arithmetic coding and vector quantization. This coding paradigm produces high quality coded speech and audio, is based upon well-known models of the auditory system, is highly scalable, and has moderate complexity.
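The analysis stage described in the abstract (critical band filtering followed by conversion to pulse trains) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: it uses the textbook fourth-order gammatone impulse response with the Glasberg and Moore ERB formula and the commonly quoted bandwidth factor 1.019, not the paper's linear phase design; pulses are obtained by simple half-wave rectification and peak picking; and the masking, quantization and synthesis steps are omitted. All function names and the three centre frequencies are hypothetical.

```python
import math

def erb(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore formula, assumed here)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def gammatone_ir(fc_hz, fs_hz, duration_s=0.025, order=4):
    """Sampled gammatone impulse response, scaled to unit peak magnitude."""
    b = 1.019 * erb(fc_hz)  # bandwidth parameter (commonly used factor, an assumption)
    n_taps = round(duration_s * fs_hz)
    ir = []
    for k in range(n_taps):
        t = k / fs_hz
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        ir.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    peak = max(abs(v) for v in ir)
    return [v / peak for v in ir]

def fir_filter(x, h):
    """Causal FIR filtering; the output has the same length as the input."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(min(n + 1, len(h))):
            acc += h[k] * x[n - k]
        y.append(acc)
    return y

def pulses(y):
    """Half-wave rectify, then keep local maxima as (position, amplitude) pairs."""
    r = [max(v, 0.0) for v in y]
    return [(n, r[n]) for n in range(1, len(r) - 1)
            if r[n] > r[n - 1] and r[n] >= r[n + 1]]

fs = 16000                                   # sampling rate in Hz (assumed)
x = [math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(800)]  # 50 ms, 1 kHz tone
centres = [500.0, 1000.0, 4000.0]            # illustrative band centres only
bank = {fc: pulses(fir_filter(x, gammatone_ir(fc, fs))) for fc in centres}

for fc in centres:
    strongest = max(a for _, a in bank[fc])
    print(f"{fc:6.0f} Hz band: {len(bank[fc])} pulses, largest amplitude {strongest:.3f}")
```

In this sketch each band is reduced to a list of pulse positions and amplitudes, which is the kind of time-frequency parameterization the abstract describes before masking removes perceptually redundant pulses. A real coder would use many more bands and a synthesis filter bank designed for reconstruction.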
Similar resources
A Gammatone-based Psychoacoustical Modeling Approach for Speech and Audio Coding
We propose a new approach for modeling auditory masking based on gammatone filters for application areas including speech/audio coding and audio watermarking. Besides the use of gammatone filters, this model differs from existing audio coding psychoacoustical models (e.g., the ones used in MPEG), in taking into account the contribution of a range of filters in computing the distortion, rather t...
An investigation of non-uniform bandwidths auditory filterbank in audio coding
This paper presents an investigation into the use of a non-linear auditory filterbank in wideband audio coding. The perceptually based parameterization of the audio signal using a gammatone filterbank is examined and discussed. Conventional gammatone filters require high-order FIR filters in the synthesis stage, which introduce long delay and large computational cost. Here, a simple and efficient synt...
Audlet Filter Banks: A Versatile Analysis/Synthesis Framework Using Auditory Frequency Scales
Many audio applications rely on filter banks (FBs) to analyze, process, and re-synthesize sounds. For these applications, an important property of the analysis–synthesis system is the reconstruction error; it has to be minimized to avoid audible artifacts. Other advantageous properties include stability and low redundancy. To exploit some aspects of auditory perception in the signal chain, some...
Wideband Speech Recovery Using Psychoacoustic Criteria
Many modern speech bandwidth extension techniques predict the high-frequency band based on features extracted from the lower band. While this method works for certain types of speech, problems arise when the correlation between the low and the high bands is not sufficient for adequate prediction. These situations require that additional high-band information is sent to the decoder. This overhead ...
Analysis of Noise Characteristics in VMR-WB Speech Using Sub Band Filters
Speech coding has become one of the most vital techniques in telecommunications and multimedia communications. Existing speech coding techniques are appropriate only for inactive location and degrade the speech quality. Voice quality improvement is an imperative characteristic of any speech communication scheme. Speech improvement and noise diminution techniques can significan...
Publication date: 2001